Overview

Dataset statistics

Number of variables: 29
Number of observations: 2085494
Missing cells: 17849110
Missing cells (%): 29.5%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 2.0 GiB
Average record size in memory: 1.0 KiB
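
The missing-cell percentage can be cross-checked from the other figures in this block. The sketch below assumes that "cells" means variables times observations; the variable names are illustrative only.

```python
# Figures copied from the "Dataset statistics" block of the report.
n_variables = 29
n_observations = 2_085_494
missing_cells = 17_849_110

# Assumption: a "cell" is one (row, column) position in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"{total_cells} cells, {missing_pct:.1f}% missing")
```

Under that assumption the result agrees with the 29.5% reported above.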

Variable types

DateTime: 2
Categorical: 6
Unsupported: 1
Numeric: 8
Text: 12

Alerts

NUMBER OF PEDESTRIANS KILLED is highly imbalanced (99.6%) [Imbalance]
NUMBER OF CYCLIST INJURED is highly imbalanced (92.3%) [Imbalance]
NUMBER OF CYCLIST KILLED is highly imbalanced (99.9%) [Imbalance]
CONTRIBUTING FACTOR VEHICLE 4 is highly imbalanced (90.8%) [Imbalance]
CONTRIBUTING FACTOR VEHICLE 5 is highly imbalanced (89.9%) [Imbalance]
BOROUGH has 648930 (31.1%) missing values [Missing]
ZIP CODE has 649184 (31.1%) missing values [Missing]
LATITUDE has 234300 (11.2%) missing values [Missing]
LONGITUDE has 234300 (11.2%) missing values [Missing]
LOCATION has 234300 (11.2%) missing values [Missing]
ON STREET NAME has 443414 (21.3%) missing values [Missing]
CROSS STREET NAME has 789605 (37.9%) missing values [Missing]
OFF STREET NAME has 1734453 (83.2%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 2 has 324030 (15.5%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 3 has 1936332 (92.8%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 4 has 2051785 (98.4%) missing values [Missing]
CONTRIBUTING FACTOR VEHICLE 5 has 2076357 (99.6%) missing values [Missing]
VEHICLE TYPE CODE 2 has 399976 (19.2%) missing values [Missing]
VEHICLE TYPE CODE 3 has 1941769 (93.1%) missing values [Missing]
VEHICLE TYPE CODE 4 has 2052958 (98.4%) missing values [Missing]
VEHICLE TYPE CODE 5 has 2076637 (99.6%) missing values [Missing]
LATITUDE is highly skewed (γ1 = -20.39832654) [Skewed]
NUMBER OF PERSONS KILLED is highly skewed (γ1 = 33.61661618) [Skewed]
NUMBER OF MOTORIST KILLED is highly skewed (γ1 = 54.56006989) [Skewed]
COLLISION_ID has unique values [Unique]
ZIP CODE is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
NUMBER OF PERSONS INJURED has 1607054 (77.1%) zeros [Zeros]
NUMBER OF PERSONS KILLED has 2082457 (99.9%) zeros [Zeros]
NUMBER OF PEDESTRIANS INJURED has 1972076 (94.6%) zeros [Zeros]
NUMBER OF MOTORIST INJURED has 1780333 (85.4%) zeros [Zeros]
NUMBER OF MOTORIST KILLED has 2084302 (99.9%) zeros [Zeros]
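
The report does not say how the imbalance percentages are derived. One reading that is consistent with the 92.3% shown for NUMBER OF CYCLIST INJURED is one minus the Shannon entropy of the class shares, normalised by the log of the number of classes. The sketch below is written under that assumption; `imbalance_score` is an illustrative name, not a function taken from the report.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k), with H the Shannon entropy (in bits) of the class shares."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # a single class has no entropy to normalise
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

# Value counts copied from the NUMBER OF CYCLIST INJURED common-values table.
cyclist_injured = [2030012, 54849, 609, 23, 1]
score = imbalance_score(cyclist_injured)
print(f"{score:.1%}")
```

A score near 100% means nearly all rows fall in one class, which is why the share of the top value (97.3% here) and the imbalance score (92.3%) are different numbers.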

Reproduction

Analysis started: 2024-05-07 03:12:49.209694
Analysis finished: 2024-05-07 03:13:48.986679
Duration: 59.78 seconds
Software version: ydata-profiling v4.7.0
Download configuration: config.json

Variables

Distinct: 4325
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Memory size: 15.9 MiB
Minimum: 2012-07-01 00:00:00
Maximum: 2024-05-03 00:00:00
Histogram with fixed size bins (bins=50)

Distinct: 1440
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 15.9 MiB
Minimum: 2024-05-06 00:00:00
Maximum: 2024-05-06 23:59:00
Histogram with fixed size bins (bins=50)

BOROUGH
Categorical

MISSING 

Distinct: 5
Distinct (%): < 0.1%
Missing: 648930
Missing (%): 31.1%
Memory size: 127.9 MiB

BROOKLYN: 457165
QUEENS: 385197
MANHATTAN: 321417
BRONX: 212475
STATEN ISLAND: 60310

Length

Max length: 13
Median length: 9
Mean length: 7.4536603
Min length: 5

Characters and Unicode

Total characters: 10707660
Distinct characters: 19
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: BROOKLYN
2nd row: BROOKLYN
3rd row: BRONX
4th row: BROOKLYN
5th row: MANHATTAN

Common Values

Value  Count  Frequency (%)
BROOKLYN  457165  21.9%
QUEENS  385197  18.5%
MANHATTAN  321417  15.4%
BRONX  212475  10.2%
STATEN ISLAND  60310  2.9%
(Missing)  648930  31.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
brooklyn  457165  30.5%
queens  385197  25.7%
manhattan  321417  21.5%
bronx  212475  14.2%
staten  60310  4.0%
island  60310  4.0%
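
The two BOROUGH tables above show different percentages for the same counts (for example 21.9% and 30.5% for Brooklyn). The report does not explain the difference. One interpretation that matches the figures is that the first table divides by all rows, missing ones included, while the second divides by the number of whitespace-separated word tokens. The sketch below checks that reading; treat it as an assumption rather than a description of the profiler's internals.

```python
rows_total = 2_085_494  # number of observations, including rows with no borough
borough_counts = {
    "BROOKLYN": 457165,
    "QUEENS": 385197,
    "MANHATTAN": 321417,
    "BRONX": 212475,
    "STATEN ISLAND": 60310,
}

# Assumption: each row contributes one token per whitespace-separated word,
# so "STATEN ISLAND" counts twice in the token total.
tokens_total = sum(count * len(name.split()) for name, count in borough_counts.items())

for name, count in borough_counts.items():
    row_share = 100 * count / rows_total
    token_share = 100 * count / tokens_total
    print(f"{name}: {row_share:.1f}% of rows, {token_share:.1f}% of word tokens")
```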

Most occurring characters

Value  Count  Frequency (%)
N  1818291  17.0%
O  1126805  10.5%
A  1084871  10.1%
E  830704  7.8%
T  763454  7.1%
R  669640  6.3%
B  669640  6.3%
L  517475  4.8%
S  505817  4.7%
Y  457165  4.3%
Other values (9)  2263798  21.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  10707660  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  10707660  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  10707660  100.0%

ZIP CODE
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing: 649184
Missing (%): 31.1%
Memory size: 75.1 MiB
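
The report flags ZIP CODE as unsupported without stating the cause. A common reason for this kind of flag is a column that mixes strings, floats and blanks, but that is an assumption here, not something the report confirms. Under that assumption, a minimal cleaning sketch might look like the following; `normalize_zip` is a hypothetical helper.

```python
import math
import re

def normalize_zip(value):
    """Return a 5-digit ZIP string, or None when the value cannot be interpreted."""
    if value is None:
        return None
    if isinstance(value, float):
        # NaN, infinities and fractional values are treated as missing.
        if not math.isfinite(value) or not value.is_integer():
            return None
        value = int(value)  # e.g. 11208.0 -> 11208
    if isinstance(value, int):
        text = f"{value:05d}"  # restore a leading zero lost in numeric storage
    else:
        text = str(value).strip()
    # Accept plain 5-digit codes and ZIP+4, keeping only the first five digits.
    match = re.fullmatch(r"(\d{5})(?:-\d{4})?", text)
    return match.group(1) if match else None

print(normalize_zip(11208.0), normalize_zip(" 10027 "), normalize_zip(""))
```

Values that cannot be mapped to five digits come back as None, so they can be counted together with the 649184 values already reported missing.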

LATITUDE
Real number (ℝ)

MISSING  SKEWED 

Distinct: 126738
Distinct (%): 6.8%
Missing: 234300
Missing (%): 11.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.627378
Minimum: 0
Maximum: 43.344444
Zeros: 4396
Zeros (%): 0.2%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 40.596612
Q1: 40.667774
median: 40.720764
Q3: 40.769612
95-th percentile: 40.86204
Maximum: 43.344444
Range: 43.344444
Interquartile range (IQR): 0.1018378

Descriptive statistics

Standard deviation: 1.9837524
Coefficient of variation (CV): 0.04882797
Kurtosis: 414.76605
Mean: 40.627378
Median Absolute Deviation (MAD): 0.0513402
Skewness: -20.398327
Sum: 75209158
Variance: 3.9352735
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  4396  0.2%
40.861862  889  < 0.1%
40.696033  771  < 0.1%
40.8047  692  < 0.1%
40.608757  671  < 0.1%
40.798256  627  < 0.1%
40.759308  625  < 0.1%
40.6960346  587  < 0.1%
40.675735  559  < 0.1%
40.658577  523  < 0.1%
Other values (126728)  1840854  88.3%
(Missing)  234300  11.2%

Minimum values

Value  Count  Frequency (%)
0  4396  0.2%
30.78418  1  < 0.1%
34.783634  1  < 0.1%
40.498947  1  < 0.1%
40.4989488  2  < 0.1%
40.4991346  1  < 0.1%
40.49931  1  < 0.1%
40.4994787  1  < 0.1%
40.499659  1  < 0.1%
40.49971  1  < 0.1%

Maximum values

Value  Count  Frequency (%)
43.344444  1  < 0.1%
42.64154  1  < 0.1%
42.318317  1  < 0.1%
42.107204  1  < 0.1%
41.91661  1  < 0.1%
41.34796  1  < 0.1%
41.258785  1  < 0.1%
41.12615  5  < 0.1%
41.12421  1  < 0.1%
41.061634  2  < 0.1%

LONGITUDE
Real number (ℝ)

MISSING 

Distinct: 98436
Distinct (%): 5.3%
Missing: 234300
Missing (%): 11.2%
Infinite: 0
Infinite (%): 0.0%
Mean: -73.751547
Minimum: -201.35999
Maximum: 0
Zeros: 4396
Zeros (%): 0.2%
Negative: 1846798
Negative (%): 88.6%
Memory size: 15.9 MiB

Quantile statistics

Minimum: -201.35999
5-th percentile: -74.03613
Q1: -73.97483
median: -73.92726
Q3: -73.866731
95-th percentile: -73.76325
Maximum: 0
Range: 201.35999
Interquartile range (IQR): 0.1080989

Descriptive statistics

Standard deviation: 3.7281252
Coefficient of variation (CV): -0.05054979
Kurtosis: 439.11784
Mean: -73.751547
Median Absolute Deviation (MAD): 0.052606
Skewness: 16.106495
Sum: -1.3652842 × 10^8
Variance: 13.898918
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value  Count  Frequency (%)
0  4396  0.2%
-73.89063  768  < 0.1%
-73.91282  719  < 0.1%
-73.98453  700  < 0.1%
-74.038086  672  < 0.1%
-73.89686  659  < 0.1%
-73.91243  654  < 0.1%
-73.94476  590  < 0.1%
-73.9845292  587  < 0.1%
-73.9112  581  < 0.1%
Other values (98426)  1840868  88.3%
(Missing)  234300  11.2%

Minimum values

Value  Count  Frequency (%)
-201.35999  1  < 0.1%
-201.23706  105  < 0.1%
-89.13527  1  < 0.1%
-86.76847  1  < 0.1%
-79.61955  1  < 0.1%
-79.00183  1  < 0.1%
-76.2634  1  < 0.1%
-76.02163  1  < 0.1%
-74.742  7  < 0.1%
-74.25496  1  < 0.1%

Maximum values

Value  Count  Frequency (%)
0  4396  0.2%
-32.768513  16  < 0.1%
-47.209625  3  < 0.1%
-73.66301  1  < 0.1%
-73.70055  2  < 0.1%
-73.700584  11  < 0.1%
-73.7005968  10  < 0.1%
-73.70061  4  < 0.1%
-73.70071  4  < 0.1%
-73.70073  1  < 0.1%
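
Both coordinate columns contain 4396 zeros, and LONGITUDE reaches -201.35999, which is outside the valid range for a longitude. A simple way to flag such rows is a bounding-box test. The sketch below uses a hypothetical box that roughly encloses New York City; the bounds are an assumption chosen so that the reported 5th to 95th percentiles fall inside it, and they should be tuned before real use. The sample pairs are illustrative combinations of values seen in the tables above, not actual rows.

```python
# Hypothetical bounding box roughly enclosing New York City; tune before use.
LAT_RANGE = (40.4, 41.0)
LON_RANGE = (-74.3, -73.6)

def is_plausible(lat, lon):
    """Return True when both coordinates are present and fall inside the box."""
    if lat is None or lon is None:
        return False
    return (LAT_RANGE[0] <= lat <= LAT_RANGE[1]
            and LON_RANGE[0] <= lon <= LON_RANGE[1])

# Illustrative pairs built from values that appear in the report's tables.
samples = [(40.667202, -73.8665), (0.0, 0.0), (40.7, -201.35999), (43.344444, -73.9)]
print([is_plausible(lat, lon) for lat, lon in samples])
```

Only the first pair passes; the zero pair and the out-of-range values are rejected.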

LOCATION
Text

MISSING 

Distinct: 284545
Distinct (%): 15.4%
Missing: 234300
Missing (%): 11.2%
Memory size: 148.0 MiB

Length

Max length: 25
Median length: 24
Mean length: 22.774418
Min length: 10

Characters and Unicode

Total characters: 42159866
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 156445
Unique (%): 8.5%

Sample

1st row: (40.667202, -73.8665)
2nd row: (40.683304, -73.917274)
3rd row: (40.709183, -73.956825)
4th row: (40.86816, -73.83148)
5th row: (40.67172, -73.8971)
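
The sample rows show LOCATION stored as text in the form "(latitude, longitude)". A minimal sketch for turning such strings into numeric pairs is shown below; `parse_location` is a hypothetical helper, and the pattern assumes plain decimal numbers with no exponent notation.

```python
import re

# Matches "(lat, lon)" with optional minus signs and optional decimals.
_POINT = re.compile(r"^\(\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\)$")

def parse_location(text):
    """Parse '(lat, lon)' into a (float, float) tuple, or None if it does not match."""
    if not text:
        return None
    match = _POINT.match(text.strip())
    if match is None:
        return None
    return float(match.group(1)), float(match.group(2))

print(parse_location("(40.667202, -73.8665)"))
```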

Value  Count  Frequency (%)
0.0  8792  0.2%
40.861862  889  < 0.1%
40.696033  771  < 0.1%
73.89063  768  < 0.1%
73.91282  719  < 0.1%
73.98453  700  < 0.1%
40.8047  692  < 0.1%
74.038086  672  < 0.1%
40.608757  671  < 0.1%
73.89686  659  < 0.1%
Other values (225163)  3687055  99.6%

Most occurring characters

Value  Count  Frequency (%)
7  4617330  11.0%
4  4000028  9.5%
.  3702388  8.8%
3  3515270  8.3%
0  3417232  8.1%
9  2712408  6.4%
8  2661399  6.3%
6  2629407  6.2%
5  2104583  5.0%
(  1851194  4.4%
Other values (6)  10948627  26.0%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  42159866  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  42159866  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  42159866  100.0%

ON STREET NAME
Text

MISSING 

Distinct: 18458
Distinct (%): 1.1%
Missing: 443414
Missing (%): 21.3%
Memory size: 149.1 MiB

Length

Max length: 32
Median length: 32
Mean length: 29.56421
Min length: 2

Characters and Unicode

Total characters: 48546798
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6551
Unique (%): 0.4%

Sample

1st row: WHITESTONE EXPRESSWAY
2nd row: QUEENSBORO BRIDGE UPPER
3rd row: THROGS NECK BRIDGE
4th row: SARATOGA AVENUE
5th row: MAJOR DEEGAN EXPRESSWAY RAMP

Value  Count  Frequency (%)
avenue  610782  16.1%
street  522974  13.8%
east  154098  4.1%
boulevard  127527  3.4%
west  115215  3.0%
parkway  75148  2.0%
road  68379  1.8%
expressway  63767  1.7%
island  30625  0.8%
queens  27288  0.7%
Other values (5394)  1993059  52.6%

Most occurring characters

Value  Count  Frequency (%)
(space)  27572212  56.8%
E  3689040  7.6%
A  1960159  4.0%
T  1839600  3.8%
R  1677642  3.5%
N  1434455  3.0%
S  1414317  2.9%
U  981985  2.0%
O  872965  1.8%
V  855815  1.8%
Other values (65)  6248608  12.9%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  48546798  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  48546798  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  48546798  100.0%

CROSS STREET NAME
Text

MISSING 

Distinct: 20256
Distinct (%): 1.6%
Missing: 789605
Missing (%): 37.9%
Memory size: 122.6 MiB

Length

Max length: 32
Median length: 32
Mean length: 22.670409
Min length: 1

Characters and Unicode

Total characters: 29378334
Distinct characters: 76
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 6202
Unique (%): 0.5%

Sample

1st row: 20 AVENUE
2nd row: DECATUR STREET
3rd row: EAST 43 STREET
4th row: EAST GATE PLAZA
5th row: west 80 street -west 81 street

Value  Count  Frequency (%)
avenue  567479  19.8%
street  461146  16.1%
east  112564  3.9%
west  71342  2.5%
boulevard  68928  2.4%
road  55777  1.9%
place  34080  1.2%
parkway  26719  0.9%
3  18826  0.7%
park  17492  0.6%
Other values (5485)  1431852  50.0%

Most occurring characters

Value  Count  Frequency (%)
(space)  14121512  48.1%
E  2948298  10.0%
T  1458774  5.0%
A  1425079  4.9%
R  1151727  3.9%
N  1079107  3.7%
S  992513  3.4%
U  780282  2.7%
V  711591  2.4%
O  580789  2.0%
Other values (66)  4128662  14.1%
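
In the character table above, the most frequent character has no visible glyph and accounts for 48.1% of all characters, while the median length equals the maximum of 32. Together with the mixed-case sample row, this suggests (as an assumption, since the report does not confirm it) that many street names are padded with spaces to a fixed width and entered in inconsistent case. A minimal normalisation sketch under that assumption follows; `normalize_street` is a hypothetical helper, and the padded input is a constructed example.

```python
import re

def normalize_street(name):
    """Trim padding, collapse internal whitespace runs, and upper-case a street name."""
    if name is None:
        return None
    cleaned = re.sub(r"\s+", " ", name).strip().upper()
    return cleaned or None  # whitespace-only input is treated as missing

# Constructed example: a name padded with trailing spaces to 32 characters.
print(normalize_street("DECATUR STREET".ljust(32)))
# Sample row from the report, entered in lower case.
print(normalize_street("west 80 street -west 81 street"))
```

Normalising before counting distinct values would likely merge variants that differ only in case or spacing, so the distinct count reported above may overstate the number of real streets.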

Most occurring categories

Value  Count  Frequency (%)
(unknown)  29378334  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  29378334  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  29378334  100.0%

OFF STREET NAME
Text

MISSING 

Distinct: 227639
Distinct (%): 64.8%
Missing: 1734453
Missing (%): 83.2%
Memory size: 84.0 MiB

Length

Max length: 40
Median length: 40
Mean length: 35.917465
Min length: 8

Characters and Unicode

Total characters: 12608503
Distinct characters: 84
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 177508
Unique (%): 50.6%

Sample

1st row: 1211 LORING AVENUE
2nd row: 344 BAYCHESTER AVENUE
3rd row: 2047 PITKIN AVENUE
4th row: 480 DEAN STREET
5th row: 878 FLATBUSH AVENUE

Value  Count  Frequency (%)
avenue  139121  11.9%
street  126988  10.9%
east  33467  2.9%
west  24196  2.1%
boulevard  22279  1.9%
road  16574  1.4%
lot  7881  0.7%
parking  7267  0.6%
parkway  6996  0.6%
of  6954  0.6%
Other values (27625)  775853  66.4%

Most occurring characters

Value  Count  Frequency (%)
(space)  6887772  54.6%
E  803446  6.4%
T  439996  3.5%
A  411834  3.3%
R  342330  2.7%
N  300915  2.4%
S  288305  2.3%
1  279274  2.2%
U  204657  1.6%
V  190938  1.5%
Other values (74)  2459036  19.5%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  12608503  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  12608503  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  12608503  100.0%

NUMBER OF PERSONS INJURED
Real number (ℝ)

ZEROS 

Distinct: 32
Distinct (%): < 0.1%
Missing: 18
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.31104122
Minimum: 0
Maximum: 43
Zeros: 1607054
Zeros (%): 77.1%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 2
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.70104436
Coefficient of variation (CV): 2.2538632
Kurtosis: 50.982476
Mean: 0.31104122
Median Absolute Deviation (MAD): 0
Skewness: 4.2497531
Sum: 648669
Variance: 0.49146319
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=32)

Common Values

Value  Count  Frequency (%)
0  1607054  77.1%
1  371328  17.8%
2  69944  3.4%
3  22850  1.1%
4  8466  0.4%
5  3249  0.2%
6  1361  0.1%
7  579  < 0.1%
8  255  < 0.1%
9  130  < 0.1%
Other values (22)  260  < 0.1%

Minimum values

Value  Count  Frequency (%)
0  1607054  77.1%
1  371328  17.8%
2  69944  3.4%
3  22850  1.1%
4  8466  0.4%
5  3249  0.2%
6  1361  0.1%
7  579  < 0.1%
8  255  < 0.1%
9  130  < 0.1%

Maximum values

Value  Count  Frequency (%)
43  1  < 0.1%
40  1  < 0.1%
34  1  < 0.1%
32  1  < 0.1%
31  1  < 0.1%
27  1  < 0.1%
25  1  < 0.1%
24  3  < 0.1%
23  1  < 0.1%
22  3  < 0.1%

NUMBER OF PERSONS KILLED
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 7
Distinct (%): < 0.1%
Missing: 31
Missing (%): < 0.1%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.0015003862
Minimum: 0
Maximum: 8
Zeros: 2082457
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 8
Range: 8
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.040839716
Coefficient of variation (CV): 27.219468
Kurtosis: 1922.4521
Mean: 0.0015003862
Median Absolute Deviation (MAD): 0
Skewness: 33.616616
Sum: 3129
Variance: 0.0016678824
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)

Common Values

Value  Count  Frequency (%)
0  2082457  99.9%
1  2913  0.1%
2  75  < 0.1%
3  12  < 0.1%
4  3  < 0.1%
5  2  < 0.1%
8  1  < 0.1%
(Missing)  31  < 0.1%

Minimum values

Value  Count  Frequency (%)
0  2082457  99.9%
1  2913  0.1%
2  75  < 0.1%
3  12  < 0.1%
4  3  < 0.1%
5  2  < 0.1%
8  1  < 0.1%

Maximum values

Value  Count  Frequency (%)
8  1  < 0.1%
5  2  < 0.1%
4  3  < 0.1%
3  12  < 0.1%
2  75  < 0.1%
1  2913  0.1%
0  2082457  99.9%
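
Because the table for NUMBER OF PERSONS KILLED lists every distinct value, the reported mean and skewness can be recomputed from the counts alone. The sketch below assumes the skewness is the usual third standardised moment with a sample-size adjustment; at this sample size the adjusted and unadjusted forms are almost identical, so the check does not depend much on which one the profiler used.

```python
import math

# Value counts copied from the NUMBER OF PERSONS KILLED table (missing rows excluded).
counts = {0: 2082457, 1: 2913, 2: 75, 3: 12, 4: 3, 5: 2, 8: 1}

n = sum(counts.values())
mean = sum(v * c for v, c in counts.items()) / n
m2 = sum(c * (v - mean) ** 2 for v, c in counts.items()) / n  # second central moment
m3 = sum(c * (v - mean) ** 3 for v, c in counts.items()) / n  # third central moment

g1 = m3 / m2 ** 1.5                              # population skewness
G1 = g1 * math.sqrt(n * (n - 1)) / (n - 2)       # sample-size adjusted skewness

print(f"n={n} mean={mean:.10f} skewness={G1:.4f}")
```

The result should land close to the reported mean of 0.0015003862 and skewness of 33.616616. The large positive value reflects a long right tail produced by a handful of multi-fatality rows among more than two million zeros.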

NUMBER OF PEDESTRIANS INJURED
Real number (ℝ)

ZEROS 

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.056723731
Minimum: 0
Maximum: 27
Zeros: 1972076
Zeros (%): 94.6%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 27
Range: 27
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.2443857
Coefficient of variation (CV): 4.3083502
Kurtosis: 127.90885
Mean: 0.056723731
Median Absolute Deviation (MAD): 0
Skewness: 5.6654502
Sum: 118297
Variance: 0.059724369
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)

Common Values

Value  Count  Frequency (%)
0  1972076  94.6%
1  109261  5.2%
2  3679  0.2%
3  369  < 0.1%
4  61  < 0.1%
5  25  < 0.1%
6  11  < 0.1%
7  4  < 0.1%
9  2  < 0.1%
8  2  < 0.1%
Other values (4)  4  < 0.1%

Minimum values

Value  Count  Frequency (%)
0  1972076  94.6%
1  109261  5.2%
2  3679  0.2%
3  369  < 0.1%
4  61  < 0.1%
5  25  < 0.1%
6  11  < 0.1%
7  4  < 0.1%
8  2  < 0.1%
9  2  < 0.1%

Maximum values

Value  Count  Frequency (%)
27  1  < 0.1%
19  1  < 0.1%
15  1  < 0.1%
13  1  < 0.1%
9  2  < 0.1%
8  2  < 0.1%
7  4  < 0.1%
6  11  < 0.1%
5  25  < 0.1%
4  61  < 0.1%

NUMBER OF PEDESTRIANS KILLED
Categorical

IMBALANCE 

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 115.4 MiB

0: 2083961
1: 1520
2: 12
6: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2085494
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2083961  99.9%
1  1520  0.1%
2  12  < 0.1%
6  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2083961  99.9%
1  1520  0.1%
2  12  < 0.1%
6  1  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0  2083961  99.9%
1  1520  0.1%
2  12  < 0.1%
6  1  < 0.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

NUMBER OF CYCLIST INJURED
Categorical

IMBALANCE 

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 115.4 MiB

0: 2030012
1: 54849
2: 609
3: 23
4: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2085494
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2030012  97.3%
1  54849  2.6%
2  609  < 0.1%
3  23  < 0.1%
4  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2030012  97.3%
1  54849  2.6%
2  609  < 0.1%
3  23  < 0.1%
4  1  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0  2030012  97.3%
1  54849  2.6%
2  609  < 0.1%
3  23  < 0.1%
4  1  < 0.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

NUMBER OF CYCLIST KILLED
Categorical

IMBALANCE 

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 115.4 MiB

0: 2085254
1: 239
2: 1

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2085494
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

Value  Count  Frequency (%)
0  2085254  > 99.9%
1  239  < 0.1%
2  1  < 0.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
0  2085254  > 99.9%
1  239  < 0.1%
2  1  < 0.1%

Most occurring characters

Value  Count  Frequency (%)
0  2085254  > 99.9%
1  239  < 0.1%
2  1  < 0.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  2085494  100.0%

NUMBER OF MOTORIST INJURED
Real number (ℝ)

ZEROS 

Distinct: 31
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.22369904
Minimum: 0
Maximum: 43
Zeros: 1780333
Zeros (%): 85.4%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 1
Maximum: 43
Range: 43
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.66221489
Coefficient of variation (CV): 2.9602939
Kurtosis: 63.304937
Mean: 0.22369904
Median Absolute Deviation (MAD): 0
Skewness: 5.1139917
Sum: 466523
Variance: 0.43852857
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)

Common Values

Value  Count  Frequency (%)
0  1780333  85.4%
1  205209  9.8%
2  63820  3.1%
3  22153  1.1%
4  8292  0.4%
5  3198  0.2%
6  1315  0.1%
7  553  < 0.1%
8  247  < 0.1%
9  125  < 0.1%
Other values (21)  249  < 0.1%

Minimum values

Value  Count  Frequency (%)
0  1780333  85.4%
1  205209  9.8%
2  63820  3.1%
3  22153  1.1%
4  8292  0.4%
5  3198  0.2%
6  1315  0.1%
7  553  < 0.1%
8  247  < 0.1%
9  125  < 0.1%

Maximum values

Value  Count  Frequency (%)
43  1  < 0.1%
40  1  < 0.1%
34  1  < 0.1%
31  1  < 0.1%
30  1  < 0.1%
25  1  < 0.1%
24  3  < 0.1%
23  1  < 0.1%
22  2  < 0.1%
21  1  < 0.1%

NUMBER OF MOTORIST KILLED
Real number (ℝ)

SKEWED  ZEROS 

Distinct: 6
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.00061807898
Minimum: 0
Maximum: 5
Zeros: 2084302
Zeros (%): 99.9%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 0
Maximum: 5
Range: 5
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.027193584
Coefficient of variation (CV): 43.99694
Kurtosis: 4196.5461
Mean: 0.00061807898
Median Absolute Deviation (MAD): 0
Skewness: 54.56007
Sum: 1289
Variance: 0.000739491
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)

Common Values

Value  Count  Frequency (%)
0  2084302  99.9%
1  1117  0.1%
2  59  < 0.1%
3  12  < 0.1%
4  2  < 0.1%
5  2  < 0.1%

Minimum values

Value  Count  Frequency (%)
0  2084302  99.9%
1  1117  0.1%
2  59  < 0.1%
3  12  < 0.1%
4  2  < 0.1%
5  2  < 0.1%

Maximum values

Value  Count  Frequency (%)
5  2  < 0.1%
4  2  < 0.1%
3  12  < 0.1%
2  59  < 0.1%
1  1117  0.1%
0  2084302  99.9%

CONTRIBUTING FACTOR VEHICLE 1

Distinct: 61
Distinct (%): < 0.1%
Missing: 6865
Missing (%): 0.3%
Memory size: 151.9 MiB

Length

Max length: 53
Median length: 43
Mean length: 19.513255
Min length: 1

Characters and Unicode

Total characters: 40560818
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Aggressive Driving/Road Rage
2nd row: Pavement Slippery
3rd row: Following Too Closely
4th row: Unspecified
5th row: Unspecified

Value  Count  Frequency (%)
unspecified  709158  17.1%
driver  450464  10.9%
inattention/distraction  417718  10.1%
too  163564  3.9%
closely  163564  3.9%
to  148902  3.6%
failure  130215  3.1%
yield  123992  3.0%
right-of-way  123992  3.0%
following  111587  2.7%
Other values (96)  1600235  38.6%

Most occurring characters

Value  Count  Frequency (%)
i  4563606  11.3%
e  4131184  10.2%
n  3525907  8.7%
t  2814191  6.9%
o  2393126  5.9%
r  2382367  5.9%
s  2107618  5.2%
(space)  2064762  5.1%
a  2001035  4.9%
c  1562117  3.9%
Other values (45)  13014905  32.1%

Most occurring categories

Value  Count  Frequency (%)
(unknown)  40560818  100.0%

Most occurring scripts

Value  Count  Frequency (%)
(unknown)  40560818  100.0%

Most occurring blocks

Value  Count  Frequency (%)
(unknown)  40560818  100.0%

CONTRIBUTING FACTOR VEHICLE 2
Text

MISSING

Distinct: 61
Distinct (%): < 0.1%
Missing: 324030
Missing (%): 15.5%
Memory size: 127.6 MiB

Length

Max length: 53
Median length: 11
Mean length: 13.049252
Min length: 1

Characters and Unicode

Total characters: 22985787
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified
Value | Count | Frequency (%)
unspecified | 1483034 | 68.6%
driver | 101442 | 4.7%
inattention/distraction | 94701 | 4.4%
other | 33254 | 1.5%
vehicular | 32189 | 1.5%
too | 27894 | 1.3%
closely | 27894 | 1.3%
passing | 21660 | 1.0%
to | 21608 | 1.0%
lane | 20197 | 0.9%
Other values (96) | 296990 | 13.7%

Most occurring characters

Value | Count | Frequency (%)
i | 3623508 | 15.8%
e | 3526950 | 15.3%
n | 2060162 | 9.0%
s | 1765767 | 7.7%
c | 1673714 | 7.3%
d | 1556899 | 6.8%
p | 1553108 | 6.8%
f | 1539442 | 6.7%
U | 1519713 | 6.6%
t | 622043 | 2.7%
Other values (45) | 3544481 | 15.4%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 22985787 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

CONTRIBUTING FACTOR VEHICLE 3
Text

MISSING

Distinct: 51
Distinct (%): < 0.1%
Missing: 1936332
Missing (%): 92.8%
Memory size: 68.9 MiB

Length

Max length: 53
Median length: 11
Mean length: 11.657151
Min length: 1

Characters and Unicode

Total characters: 1738804
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified
Value | Count | Frequency (%)
unspecified | 139044 | 85.8%
other | 2839 | 1.8%
vehicular | 2799 | 1.7%
driver | 2150 | 1.3%
too | 2024 | 1.2%
closely | 2024 | 1.2%
following | 1970 | 1.2%
inattention/distraction | 1967 | 1.2%
fatigued/drowsy | 853 | 0.5%
pavement | 414 | 0.3%
Other values (79) | 5951 | 3.7%

Most occurring characters

Value | Count | Frequency (%)
e | 297136 | 17.1%
i | 295801 | 17.0%
n | 152490 | 8.8%
s | 146038 | 8.4%
c | 145473 | 8.4%
d | 141153 | 8.1%
p | 140719 | 8.1%
f | 139955 | 8.0%
U | 139713 | 8.0%
o | 17376 | 1.0%
Other values (45) | 122950 | 7.1%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 1738804 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

CONTRIBUTING FACTOR VEHICLE 4
Categorical

IMBALANCE  MISSING 

Distinct: 41
Distinct (%): 0.1%
Missing: 2051785
Missing (%): 98.4%
Memory size: 127.4 MiB

Value | Count
Unspecified | 31793
Other Vehicular | 623
Following Too Closely | 392
Driver Inattention/Distraction | 278
Fatigued/Drowsy | 170
Other values (36) | 453

Length

Max length: 43
Median length: 11
Mean length: 11.490818
Min length: 5

Characters and Unicode

Total characters: 387344
Distinct characters: 51
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 31793 | 1.5%
Other Vehicular | 623 | < 0.1%
Following Too Closely | 392 | < 0.1%
Driver Inattention/Distraction | 278 | < 0.1%
Fatigued/Drowsy | 170 | < 0.1%
Pavement Slippery | 119 | < 0.1%
Reaction to Uninvolved Vehicle | 42 | < 0.1%
Unsafe Speed | 32 | < 0.1%
Outside Car Distraction | 29 | < 0.1%
Driver Inexperience | 27 | < 0.1%
Other values (31) | 204 | < 0.1%
(Missing) | 2051785 | 98.4%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 31793 | 88.1%
other | 632 | 1.8%
vehicular | 623 | 1.7%
too | 397 | 1.1%
closely | 397 | 1.1%
following | 392 | 1.1%
driver | 305 | 0.8%
inattention/distraction | 278 | 0.8%
fatigued/drowsy | 170 | 0.5%
pavement | 122 | 0.3%
Other values (64) | 975 | 2.7%

Most occurring characters

Value | Count | Frequency (%)
e | 67193 | 17.3%
i | 66571 | 17.2%
n | 33888 | 8.7%
c | 32970 | 8.5%
s | 32946 | 8.5%
p | 32161 | 8.3%
d | 32149 | 8.3%
f | 31921 | 8.2%
U | 31901 | 8.2%
o | 3097 | 0.8%
Other values (41) | 22547 | 5.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 387344 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

CONTRIBUTING FACTOR VEHICLE 5
Categorical

IMBALANCE  MISSING 

Distinct: 30
Distinct (%): 0.3%
Missing: 2076357
Missing (%): 99.6%
Memory size: 127.3 MiB

Value | Count
Unspecified | 8611
Other Vehicular | 181
Following Too Closely | 99
Driver Inattention/Distraction | 65
Pavement Slippery | 50
Other values (25) | 131

Length

Max length: 43
Median length: 11
Mean length: 11.469738
Min length: 5

Characters and Unicode

Total characters: 104799
Distinct characters: 50
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): 0.1%

Sample

1st row: Unspecified
2nd row: Unspecified
3rd row: Unspecified
4th row: Unspecified
5th row: Unspecified

Common Values

Value | Count | Frequency (%)
Unspecified | 8611 | 0.4%
Other Vehicular | 181 | < 0.1%
Following Too Closely | 99 | < 0.1%
Driver Inattention/Distraction | 65 | < 0.1%
Pavement Slippery | 50 | < 0.1%
Fatigued/Drowsy | 41 | < 0.1%
Reaction to Uninvolved Vehicle | 12 | < 0.1%
Alcohol Involvement | 11 | < 0.1%
Obstruction/Debris | 10 | < 0.1%
Driver Inexperience | 10 | < 0.1%
Other values (20) | 47 | < 0.1%
(Missing) | 2076357 | 99.6%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
unspecified | 8611 | 88.2%
other | 183 | 1.9%
vehicular | 181 | 1.9%
too | 101 | 1.0%
closely | 101 | 1.0%
following | 99 | 1.0%
driver | 75 | 0.8%
inattention/distraction | 65 | 0.7%
pavement | 51 | 0.5%
slippery | 50 | 0.5%
Other values (47) | 251 | 2.6%

Most occurring characters

Value | Count | Frequency (%)
e | 18245 | 17.4%
i | 18001 | 17.2%
n | 9144 | 8.7%
c | 8935 | 8.5%
s | 8884 | 8.5%
p | 8739 | 8.3%
d | 8696 | 8.3%
f | 8638 | 8.2%
U | 8634 | 8.2%
o | 788 | 0.8%
Other values (40) | 6095 | 5.8%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 104799 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

COLLISION_ID
Real number (ℝ)

UNIQUE 

Distinct: 2085494
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3167144.8
Minimum: 22
Maximum: 4722272
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 15.9 MiB

Quantile statistics

Minimum: 22
5-th percentile: 105128.65
Q1: 3157493.2
median: 3678992.5
Q3: 4200608.8
95-th percentile: 4617783.3
Maximum: 4722272
Range: 4722250
Interquartile range (IQR): 1043115.5

Descriptive statistics

Standard deviation: 1505387.7
Coefficient of variation (CV): 0.47531383
Kurtosis: -0.019103384
Mean: 3167144.8
Median Absolute Deviation (MAD): 521558
Skewness: -1.226785
Sum: 6.6050615 × 10^12
Variance: 2.2661922 × 10^12
Monotonicity: Not monotonic
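
Several of the figures above follow from one another: the range is maximum minus minimum, the IQR is Q3 minus Q1, and the coefficient of variation is the standard deviation divided by the mean. The sketch below shows these definitions with Python's `statistics` module and checks two of the reported values. The function name `describe` and the choice of the "inclusive" quantile method are assumptions for illustration; the report may interpolate quantiles differently.

```python
import statistics


def describe(values):
    """Return range, IQR and coefficient of variation for a numeric sample."""
    ordered = sorted(values)
    # Quartiles with linear interpolation between data points (an assumption).
    q1, _median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    mean = statistics.fmean(ordered)
    std = statistics.stdev(ordered)  # sample standard deviation
    return {
        "range": ordered[-1] - ordered[0],
        "iqr": q3 - q1,
        "cv": std / mean,
    }


# Consistency checks against the figures reported for COLLISION_ID.
reported_cv = 1505387.7 / 3167144.8  # standard deviation / mean
reported_range = 4722272 - 22        # maximum - minimum

toy = describe([1, 2, 3, 4, 5])
```

Because the report rounds Q1 and Q3 to one decimal place, subtracting the displayed quartiles gives 1043115.6 instead of the listed 1043115.5.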
Histogram with fixed size bins (bins=50)

Value | Count | Frequency (%)
4455765 | 1 | < 0.1%
3174628 | 1 | < 0.1%
3172280 | 1 | < 0.1%
3160927 | 1 | < 0.1%
3173224 | 1 | < 0.1%
3171866 | 1 | < 0.1%
3172720 | 1 | < 0.1%
3162782 | 1 | < 0.1%
3168818 | 1 | < 0.1%
3159257 | 1 | < 0.1%
Other values (2085484) | 2085484 | > 99.9%

Smallest 10 values

Value | Count | Frequency (%)
22 | 1 | < 0.1%
23 | 1 | < 0.1%
24 | 1 | < 0.1%
25 | 1 | < 0.1%
26 | 1 | < 0.1%
27 | 1 | < 0.1%
28 | 1 | < 0.1%
29 | 1 | < 0.1%
30 | 1 | < 0.1%
31 | 1 | < 0.1%

Largest 10 values

Value | Count | Frequency (%)
4722272 | 1 | < 0.1%
4722270 | 1 | < 0.1%
4722268 | 1 | < 0.1%
4722265 | 1 | < 0.1%
4722264 | 1 | < 0.1%
4722263 | 1 | < 0.1%
4722260 | 1 | < 0.1%
4722259 | 1 | < 0.1%
4722254 | 1 | < 0.1%
4722253 | 1 | < 0.1%

VEHICLE TYPE CODE 1
Text

Distinct: 1647
Distinct (%): 0.1%
Missing: 13866
Missing (%): 0.7%
Memory size: 146.4 MiB

Length

Max length: 38
Median length: 35
Mean length: 16.882971
Min length: 1

Characters and Unicode

Total characters: 34975236
Distinct characters: 75
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1001
Unique (%): < 0.1%

Sample

1st row: Sedan
2nd row: Sedan
3rd row: Sedan
4th row: Sedan
5th row: Dump
Value | Count | Frequency (%)
vehicle | 883823 | 18.0%
utility | 637368 | 13.0%
station | 637325 | 13.0%
sedan | 623996 | 12.7%
wagon/sport | 457034 | 9.3%
passenger | 416219 | 8.5%
(blank) | 181675 | 3.7%
wagon | 180355 | 3.7%
sport | 180291 | 3.7%
truck | 86442 | 1.8%
Other values (957) | 618147 | 12.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2844267 | 8.1%
S | 2747229 | 7.9%
t | 2318875 | 6.6%
i | 1953175 | 5.6%
E | 1819085 | 5.2%
a | 1632608 | 4.7%
e | 1623631 | 4.6%
n | 1560172 | 4.5%
o | 1447330 | 4.1%
T | 1142647 | 3.3%
Other values (65) | 15886217 | 45.4%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 34975236 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

VEHICLE TYPE CODE 2
Text

MISSING 

Distinct: 1834
Distinct (%): 0.1%
Missing: 399976
Missing (%): 19.2%
Memory size: 129.7 MiB

Length

Max length: 38
Median length: 30
Mean length: 16.079394
Min length: 1

Characters and Unicode

Total characters: 27102108
Distinct characters: 73
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1085
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Pick-up Truck
3rd row: Sedan
4th row: Tractor Truck Diesel
5th row: Sedan
Value | Count | Frequency (%)
vehicle | 655799 | 17.1%
utility | 468831 | 12.2%
station | 468803 | 12.2%
sedan | 438283 | 11.4%
wagon/sport | 328599 | 8.5%
passenger | 318612 | 8.3%
(blank) | 141517 | 3.7%
wagon | 140257 | 3.6%
sport | 140204 | 3.6%
truck | 85798 | 2.2%
Other values (1011) | 658039 | 17.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 2172192 | 8.0%
S | 2038120 | 7.5%
t | 1676594 | 6.2%
i | 1440794 | 5.3%
E | 1438910 | 5.3%
e | 1198012 | 4.4%
a | 1173113 | 4.3%
n | 1114429 | 4.1%
o | 1067178 | 3.9%
T | 920168 | 3.4%
Other values (63) | 12862598 | 47.5%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 27102108 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

VEHICLE TYPE CODE 3
Text

MISSING 

Distinct: 263
Distinct (%): 0.2%
Missing: 1941769
Missing (%): 93.1%
Memory size: 69.5 MiB

Length

Max length: 35
Median length: 30
Mean length: 17.68192
Min length: 2

Characters and Unicode

Total characters: 2541334
Distinct characters: 62
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 154
Unique (%): 0.1%

Sample

1st row: Sedan
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: Sedan
4th row: Sedan
5th row: Sedan
Value | Count | Frequency (%)
vehicle | 64598 | 18.5%
utility | 49809 | 14.3%
station | 49807 | 14.3%
sedan | 47549 | 13.6%
wagon/sport | 36448 | 10.4%
passenger | 27716 | 7.9%
(blank) | 13440 | 3.9%
wagon | 13359 | 3.8%
sport | 13358 | 3.8%
truck | 4365 | 1.3%
Other values (219) | 28575 | 8.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 205734 | 8.1%
S | 201673 | 7.9%
t | 183663 | 7.2%
i | 151728 | 6.0%
a | 124057 | 4.9%
e | 123604 | 4.9%
n | 121337 | 4.8%
E | 116407 | 4.6%
o | 112351 | 4.4%
T | 77081 | 3.0%
Other values (52) | 1123699 | 44.2%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 2541334 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

VEHICLE TYPE CODE 4
Text

MISSING 

Distinct: 103
Distinct (%): 0.3%
Missing: 2052958
Missing (%): 98.4%
Memory size: 65.0 MiB

Length

Max length: 35
Median length: 30
Mean length: 17.978455
Min length: 2

Characters and Unicode

Total characters: 584947
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 47
Unique (%): 0.1%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Sedan
3rd row: Station Wagon/Sport Utility Vehicle
4th row: Sedan
5th row: Sedan
Value | Count | Frequency (%)
vehicle | 14990 | 18.9%
utility | 11816 | 14.9%
station | 11816 | 14.9%
sedan | 11508 | 14.5%
wagon/sport | 8964 | 11.3%
passenger | 5970 | 7.5%
(blank) | 2859 | 3.6%
sport | 2852 | 3.6%
wagon | 2852 | 3.6%
truck | 804 | 1.0%
Other values (104) | 5065 | 6.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 47016 | 8.0%
S | 46713 | 8.0%
t | 45036 | 7.7%
i | 36966 | 6.3%
a | 30102 | 5.1%
e | 29891 | 5.1%
n | 29582 | 5.1%
o | 27364 | 4.7%
E | 24670 | 4.2%
l | 18162 | 3.1%
Other values (47) | 249445 | 42.6%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 584947 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

VEHICLE TYPE CODE 5
Text

MISSING 

Distinct: 71
Distinct (%): 0.8%
Missing: 2076637
Missing (%): 99.6%
Memory size: 64.0 MiB

Length

Max length: 35
Median length: 30
Mean length: 18.21452
Min length: 2

Characters and Unicode

Total characters: 161326
Distinct characters: 54
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 32
Unique (%): 0.4%

Sample

1st row: Station Wagon/Sport Utility Vehicle
2nd row: Station Wagon/Sport Utility Vehicle
3rd row: Sedan
4th row: Sedan
5th row: Station Wagon/Sport Utility Vehicle
Value | Count | Frequency (%)
vehicle | 4048 | 18.5%
utility | 3354 | 15.3%
station | 3354 | 15.3%
sedan | 3214 | 14.7%
wagon/sport | 2552 | 11.7%
passenger | 1487 | 6.8%
(blank) | 804 | 3.7%
wagon | 804 | 3.7%
sport | 802 | 3.7%
truck | 248 | 1.1%
Other values (69) | 1201 | 5.5%

Most occurring characters

Value | Count | Frequency (%)
(space) | 13021 | 8.1%
t | 12829 | 8.0%
S | 12811 | 7.9%
i | 10525 | 6.5%
a | 8498 | 5.3%
e | 8443 | 5.2%
n | 8378 | 5.2%
o | 7809 | 4.8%
E | 6129 | 3.8%
l | 5170 | 3.2%
Other values (44) | 67713 | 42.0%

Most occurring categories, scripts and blocks

Value | Count | Frequency (%)
(unknown) | 161326 | 100.0%

All characters fall under a single (unknown) category, script and block, so the per-category, per-script and per-block character tables repeat the Most occurring characters table above.

Interactions


Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.
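
The per-column nullity shown in these figures amounts to counting missing entries in each column. A minimal sketch of that count follows, using plain Python records. The function name `missing_summary` and the use of `None` for missing values are assumptions made for illustration; the four records reuse BOROUGH and ON STREET NAME values from sample rows 0, 3, 7 and 9.

```python
def missing_summary(rows, columns):
    """Return {column: (missing_count, missing_fraction)} for dict records."""
    total = len(rows)
    summary = {}
    for col in columns:
        # A value counts as missing when the key is absent or set to None.
        missing = sum(1 for row in rows if row.get(col) is None)
        summary[col] = (missing, missing / total if total else 0.0)
    return summary


rows = [
    {"BOROUGH": None, "ON STREET NAME": "WHITESTONE EXPRESSWAY"},
    {"BOROUGH": "BROOKLYN", "ON STREET NAME": None},
    {"BOROUGH": "BRONX", "ON STREET NAME": None},
    {"BOROUGH": "MANHATTAN", "ON STREET NAME": "3 AVENUE"},
]
summary = missing_summary(rows, ["BOROUGH", "ON STREET NAME"])
```

Applied to the full dataset, the same count per column would yield the Missing and Missing (%) figures listed for each variable.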

Sample

First 10 rows

Index | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5
0 | 09/11/2021 | 2:39 | NaN | NaN | NaN | NaN | NaN | WHITESTONE EXPRESSWAY | 20 AVENUE | NaN | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Aggressive Driving/Road Rage | Unspecified | NaN | NaN | NaN | 4455765 | Sedan | Sedan | NaN | NaN | NaN
1 | 03/26/2022 | 11:45 | NaN | NaN | NaN | NaN | NaN | QUEENSBORO BRIDGE UPPER | NaN | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Pavement Slippery | NaN | NaN | NaN | NaN | 4513547 | Sedan | NaN | NaN | NaN | NaN
2 | 06/29/2022 | 6:55 | NaN | NaN | NaN | NaN | NaN | THROGS NECK BRIDGE | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Following Too Closely | Unspecified | NaN | NaN | NaN | 4541903 | Sedan | Pick-up Truck | NaN | NaN | NaN
3 | 09/11/2021 | 9:35 | BROOKLYN | 11208.0 | 40.667202 | -73.866500 | (40.667202, -73.8665) | NaN | NaN | 1211 LORING AVENUE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4456314 | Sedan | NaN | NaN | NaN | NaN
4 | 12/14/2021 | 8:13 | BROOKLYN | 11233.0 | 40.683304 | -73.917274 | (40.683304, -73.917274) | SARATOGA AVENUE | DECATUR STREET | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | NaN | NaN | 4486609 | NaN | NaN | NaN | NaN | NaN
5 | 04/14/2021 | 12:47 | NaN | NaN | NaN | NaN | NaN | MAJOR DEEGAN EXPRESSWAY RAMP | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4407458 | Dump | Sedan | NaN | NaN | NaN
6 | 12/14/2021 | 17:05 | NaN | NaN | 40.709183 | -73.956825 | (40.709183, -73.956825) | BROOKLYN QUEENS EXPRESSWAY | NaN | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4486555 | Sedan | Tractor Truck Diesel | NaN | NaN | NaN
7 | 12/14/2021 | 8:17 | BRONX | 10475.0 | 40.868160 | -73.831480 | (40.86816, -73.83148) | NaN | NaN | 344 BAYCHESTER AVENUE | 2.0 | 0.0 | 0 | 0 | 0 | 0 | 2 | 0 | Unspecified | Unspecified | NaN | NaN | NaN | 4486660 | Sedan | Sedan | NaN | NaN | NaN
8 | 12/14/2021 | 21:10 | BROOKLYN | 11207.0 | 40.671720 | -73.897100 | (40.67172, -73.8971) | NaN | NaN | 2047 PITKIN AVENUE | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inexperience | Unspecified | NaN | NaN | NaN | 4487074 | Sedan | NaN | NaN | NaN | NaN
9 | 12/14/2021 | 14:58 | MANHATTAN | 10017.0 | 40.751440 | -73.973970 | (40.75144, -73.97397) | 3 AVENUE | EAST 43 STREET | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Passing Too Closely | Unspecified | NaN | NaN | NaN | 4486519 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN

Last 10 rows

Index | CRASH DATE | CRASH TIME | BOROUGH | ZIP CODE | LATITUDE | LONGITUDE | LOCATION | ON STREET NAME | CROSS STREET NAME | OFF STREET NAME | NUMBER OF PERSONS INJURED | NUMBER OF PERSONS KILLED | NUMBER OF PEDESTRIANS INJURED | NUMBER OF PEDESTRIANS KILLED | NUMBER OF CYCLIST INJURED | NUMBER OF CYCLIST KILLED | NUMBER OF MOTORIST INJURED | NUMBER OF MOTORIST KILLED | CONTRIBUTING FACTOR VEHICLE 1 | CONTRIBUTING FACTOR VEHICLE 2 | CONTRIBUTING FACTOR VEHICLE 3 | CONTRIBUTING FACTOR VEHICLE 4 | CONTRIBUTING FACTOR VEHICLE 5 | COLLISION_ID | VEHICLE TYPE CODE 1 | VEHICLE TYPE CODE 2 | VEHICLE TYPE CODE 3 | VEHICLE TYPE CODE 4 | VEHICLE TYPE CODE 5
2085484 | 03/05/2024 | 20:40 | QUEENS | 11375.0 | 40.722622 | -73.849144 | (40.722622, -73.849144) | YELLOWSTONE BOULEVARD | GERARD PLACE | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Driver Inattention/Distraction | Unspecified | NaN | NaN | NaN | 4707384 | Sedan | Tractor Truck Diesel | NaN | NaN | NaN
2085485 | 03/05/2024 | 7:30 | NaN | NaN | 40.772953 | -73.920280 | (40.772953, -73.92028) | 26 STREET | HOYT AVENUE NORTH | NaN | 0.0 | 0.0 | 0 | 0 | 0 | 0 | 0 | 0 | Turning Improperly | Driver Inattention/Distraction | NaN | NaN | NaN | 4707737 | Box Truck | Garbage or Refuse | NaN | NaN | NaN
2085486 | 03/05/2024 | 14:50 | NaN | NaN | 40.646000 | -73.971750 | (40.646, -73.97175) | CHURCH AVENUE | EAST 8 STREET | NaN | 2.0 | 0.0 | 2 | 0 | 0 | 0 | 0 | 0 | NaN | NaN | NaN | NaN | NaN | 4707432 | NaN | NaN | NaN | NaN | NaN
2085487 | 03/05/2024 | 14:00 | NaN | NaN | 40.722250 | -74.005920 | (40.72225, -74.00592) | CANAL STREET | AVENUE OF THE AMERICAS | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Following Too Closely | Following Too Closely | NaN | NaN | NaN | 4707476 | Sedan | NaN | NaN | NaN | NaN
2085488 | 02/06/2024 | 12:37 | BROOKLYN | 11235.0 | 40.586670 | -73.966156 | (40.58667, -73.966156) | OCEAN PARKWAY | AVENUE Z | NaN | 1.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | 0 | Unspecified | NaN | NaN | NaN | NaN | 4707884 | E-Bike | NaN | NaN | NaN | NaN
2085489 | 03/05/2024 | 17:22 | QUEENS | 11436.0 | 40.680477 | -73.792100 | (40.680477, -73.7921) | SUTPHIN BOULEVARD | FOCH BOULEVARD | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Failure to Yield Right-of-Way | Unspecified | NaN | NaN | NaN | 4707511 | Station Wagon/Sport Utility Vehicle | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN
2085490 | 03/05/2024 | 17:00 | BROOKLYN | 11204.0 | 40.610786 | -73.978820 | (40.610786, -73.97882) | NaN | NaN | 161 AVENUE O | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Driver Inexperience | Unspecified | Unspecified | Unspecified | NaN | 4707419 | Ambulance | PK | Van | PK | NaN
2085491 | 03/03/2024 | 17:50 | NaN | NaN | 40.675053 | -73.947235 | (40.675053, -73.947235) | SAINT MARKS AVENUE | NaN | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Aggressive Driving/Road Rage | Unspecified | NaN | NaN | NaN | 4707855 | Station Wagon/Sport Utility Vehicle | PK | NaN | NaN | NaN
2085492 | 03/05/2024 | 14:30 | BROOKLYN | 11207.0 | 40.677900 | -73.892586 | (40.6779, -73.892586) | MILLER AVENUE | FULTON STREET | NaN | 1.0 | 0.0 | 1 | 0 | 0 | 0 | 0 | 0 | Pedestrian/Bicyclist/Other Pedestrian Error/Confusion | NaN | NaN | NaN | NaN | 4707872 | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN | NaN
2085493 | 03/05/2024 | 8:00 | QUEENS | 11385.0 | 40.706512 | -73.878136 | (40.706512, -73.878136) | EDSALL AVENUE | 73 STREET | NaN | 1.0 | 0.0 | 0 | 0 | 0 | 0 | 1 | 0 | Failure to Yield Right-of-Way | Unspecified | NaN | NaN | NaN | 4707447 | Sedan | Station Wagon/Sport Utility Vehicle | NaN | NaN | NaN